Abstract — Given n items, at most d of which are positive, the
theory of combinatorial group testing aims to identify all
positive items using as few tests as possible, rather than
testing the items individually. This paper is devoted to a
fundamental, thirty-year-old problem in nonadaptive group
testing theory. A binary matrix is called d-disjunct if the
boolean sum of any d columns does not contain another column
outside this collection. Let T(d) denote the minimal t such that
there exists a t × n d-disjunct matrix with n > t. T(d) can also
be viewed as the minimal t such that there exists a nonadaptive
group testing scheme which is better than the trivial one that
tests each item individually. It was known that
$T(d) \ge \binom{d+2}{2}$ and was conjectured that
$T(d) \ge (d+1)^2$. In this paper we narrow the gap by proving
$T(d)/d^2 \ge (15+\sqrt{33})/24$, a quantity in [6/7, 7/8].

Index Terms — nonadaptive group testing, disjunct matrix, 
graph matching number. 


I. Introduction 

Given n items, at most d of which are positive, the theory of
combinatorial group testing aims to identify all positive items
using as few tests as possible, rather than testing the items
individually. Its history dates back to World War II, when
biologists needed to identify people with syphilitic antigen in
a large population. The idea was first introduced by Dorfman
for testing blood samples. Since then, the theory of group
testing has been extensively studied due to its many
applications in a variety of fields, such as chemical leak
testing [22], electric shorting detection, multi-access channel
communication [23], DNA screening, pattern finding [20] and,
recently, network security [24].

Assume each of the n items is associated with an undetermined
binary status, positive (formerly called defective) or negative
(formerly called pure). A test can be viewed as a subset of the
items. The outcome of a test is 1 (positive) whenever it
contains a positive item and 0 (negative) otherwise. The problem
is to identify all positive items. Our strategy is to group the
items into several tests. In each test, a positive outcome
indicates that at least one of the items included in this test
is positive, and a negative outcome indicates that all items
included are negative. Usually the number of positive items is
bounded by a positive integer d. To find the positive items, one
can trivially test every item individually, which requires n
tests and is wasteful if d is much smaller than n. On the other
hand, the information-theoretic bound suggests that at least
$\log \binom{n}{d} \approx d \log \frac{n}{d}$ tests are needed.
There is a huge gap between n and $d \log \frac{n}{d}$, so we
should carefully design our testing algorithms.
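
To get a feeling for the size of this gap, the following minimal sketch evaluates the counting bound for one choice of parameters. The function name and the values n = 1000, d = 5 are assumptions made for illustration, and the sketch counts positive sets of size exactly d.

```python
import math

def info_theoretic_lower_bound(n: int, d: int) -> int:
    """Number of binary outcomes needed to tell apart all candidate
    positive sets of size exactly d among n items: ceil(log2 C(n, d))."""
    return math.ceil(math.log2(math.comb(n, d)))

n, d = 1000, 5                       # illustrative values only
bound = info_theoretic_lower_bound(n, d)
print(f"counting bound: {bound} tests, individual testing: {n} tests")
```

For these values the counting bound is far below n, which is the gap referred to above.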

In general there are two types of algorithms, namely adaptive
(sequential) and nonadaptive. An adaptive algorithm proceeds in
several rounds, and later tests are allowed to use the outcomes
of all previous ones. Conversely, a nonadaptive algorithm
carries out all tests simultaneously, and all positive items
must be identified in a single round. Adaptive algorithms
inherently require fewer tests than nonadaptive ones since more
information can be used. Asymptotically, for a nonadaptive group
testing scheme, the known bounds show that at least
$\Omega(\frac{d^2}{\log d} \log n)$ tests are needed [10], [11],
[21]. In the adaptive setting, however, there exist algorithms
that use as few as $O(d \log n)$ tests, which is optimal up to a
constant factor. Nevertheless, nonadaptive algorithms have their
own advantages. They save time and are preferred in applications
where time is the most pressing concern, such as DNA screening
and network security.

The application of nonadaptive group testing to molecular
biology, especially to the design of screening experiments, has
been extensively studied during the last twenty years. Readers
are referred to the comprehensive book of Du and Hwang for more
detailed information. Recently, Xuan et al. [24] found that the
idea of group testing can be adapted naturally to network
security. In the simplest attack scenario there are n clients
connecting to t servers, and among the n clients there are d
attackers. Just as a positive response is detected in a single
test, once an attacker starts attacking a server, the resources
of this server are exhausted dramatically, so it is not hard to
identify which server is a victim. In the security setting,
nonadaptive group testing attracts more attention since it is
very important to detect the defective items as soon as
possible, before they cause great damage to the whole network.

A nonadaptive group testing scheme can be represented as a
t × n boolean matrix M whose rows are indexed by the tests and
whose columns are indexed by the items, in which M(i,j) = 1 if
the j-th item is contained in the i-th test and 0 otherwise. The
matrix M is often designed to be a disjunct matrix. The notion
of “disjunctness” was introduced by Kautz and Singleton when
they were studying some important problems in information
retrieval systems. Later Erdős, Frankl and Füredi introduced a
combinatorial object named “cover-free family”, whose incidence
matrix is exactly a disjunct matrix. We say a matrix is
d-disjunct if the boolean sum of any d columns does not contain
any other column. In other words, a matrix is d-disjunct if for
any d + 1 columns indexed by $c_1, \ldots, c_{d+1}$ and for
every j = 1, ..., d + 1, there exists a row that has a 1 in the
$c_j$-th position and zeros in the other d of these positions.
It is not very difficult to see that a d-disjunct matrix gives a
nonadaptive group testing scheme that identifies any positive
set of size up to d. On the other hand, any nonadaptive group
testing scheme that identifies any positive set of size up to d
must also be a (d − 1)-disjunct matrix. Denote by t(d, n) the
minimal t such that a t × n d-disjunct matrix exists. The recent
results of D’yachkov et al. showed that the following bounds
hold asymptotically:


$$\frac{d^2 \log n}{2 \log d}\,(1+o(1)) \le t(d,n) \le \frac{e^2 d^2 \log n}{4 \log d}\,(1+o(1)).$$
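
The definition of d-disjunctness and the way such a matrix identifies the positive items can be sketched in a few lines. This is a brute-force illustration under assumptions, not an efficient algorithm: the function names and the toy matrix (whose columns are the 2-subsets of 4 tests) are chosen here for illustration.

```python
from itertools import combinations

def is_d_disjunct(M, d):
    """True if no column of the 0/1 matrix M (a list of rows) is contained
    in the boolean sum of d other columns."""
    n = len(M[0])
    # Represent each column by the set of rows (tests) in which it has a 1.
    cols = [frozenset(i for i, row in enumerate(M) if row[j]) for j in range(n)]
    for j in range(n):
        others = cols[:j] + cols[j + 1:]
        for group in combinations(others, min(d, len(others))):
            if cols[j] <= frozenset().union(*group):
                return False
    return True

def run_tests(M, positives):
    """Outcome vector: test i is 1 iff it contains a positive item."""
    return [int(any(row[j] for j in positives)) for row in M]

def decode(M, outcomes):
    """Naive decoder: keep every item that appears in no negative test."""
    n = len(M[0])
    return {j for j in range(n)
            if all(outcomes[i] for i, row in enumerate(M) if row[j])}

# Toy example (an assumption for illustration): the 6 columns are the
# 2-subsets of 4 tests, so t = 4 and n = 6.
pairs = list(combinations(range(4), 2))
M = [[int(i in p) for p in pairs] for i in range(4)]

print(is_d_disjunct(M, 1))              # checks 1-disjunctness
print(is_d_disjunct(M, 2))              # checks 2-disjunctness
print(decode(M, run_tests(M, {3})))     # decodes a single positive item
```

The decoder relies only on the disjunct property: with at most d positives, every negative item appears in at least one test containing no positive item.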


For general n, it holds that
$t(d,n) \ge \min\{\binom{d+2}{2}, n\}$, which was attributed to
Bassalygo by D’yachkov and Rykov. This bound implies that if
$n \le \binom{d+2}{2}$ then no d-disjunct algorithm is superior
to the trivial algorithm that tests every item individually. An
interesting problem is suggested by the above result: given d,
when does there exist a d-disjunct algorithm better than the
trivial one? Equivalently: given d, what is the minimal t such
that there exists a t × n d-disjunct matrix with n ≥ t + 1?
Denote this minimal t by T(d). Obviously, we have
$T(d) \ge \binom{d+2}{2}$ and $t(d,n) \ge \min\{T(d), n\}$. In
1985, Erdős, Frankl and Füredi conjectured [14] that


$$\lim_{d \to \infty} T(d)/d^2 = 1 \quad \text{(weaker version)},$$

$$T(d) \ge (d+1)^2 \quad \text{(stronger version)},$$

and they stated without proof that the stronger version holds
for d ≤ 3 and that $\lim_{d \to \infty} T(d)/d^2 \ge 5/6$. Note
that the incidence matrix of an affine plane of order d + 1 is a
$(d+1)^2 \times ((d+1)^2 + (d+1))$ d-disjunct matrix with
constant column weight d + 1, and an affine plane of order d + 1
exists if d + 1 is a prime power [2]. This implies
$\lim_{d \to \infty} T(d)/d^2 \le 1$ and $T(d) \le (d+1)^2$ when
d + 1 is a prime power. Later, Huang and Hwang proved the
stronger version for d = 4, and Chen and Hwang proved it for
d = 5. In this paper, based on a graph matching theorem of Erdős
and Gallai [13], we show that
$T(d) \ge \frac{15+\sqrt{33}}{24} d^2$ by counting the number of
specified substructures contained in the columns of the matrix.
Our result significantly improves the previous ones. It is also
worth mentioning that disjunct matrices with constant column
weight are of particular interest in the framework of DNA
screening. In the thesis of Chee, the author considered the
above conjecture for d-disjunct matrices with constant column
weight d + 1, and the problem was not completely settled (see
Theorem 5.3.1 of that thesis). By an easy counting argument we
will verify the conjecture under this constant weight
constraint.
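
The affine plane construction mentioned above can be checked by brute force in a small case. The sketch below is an illustration under assumptions: it only handles a prime order q = d + 1 (working over the integers modulo q rather than a general finite field), and the function names are chosen here for illustration.

```python
from itertools import combinations

def affine_plane_columns(q):
    """Lines of the affine plane over Z_q (q prime), each given as a
    frozenset of point indices x*q + y. Points index the rows of the
    incidence matrix and lines index its columns."""
    lines = []
    for m in range(q):                      # lines y = m*x + b
        for b in range(q):
            lines.append(frozenset(x * q + (m * x + b) % q for x in range(q)))
    for a in range(q):                      # vertical lines x = a
        lines.append(frozenset(a * q + y for y in range(q)))
    return lines

def is_d_disjunct(cols, d):
    """Brute-force check that no column lies in the union of d others."""
    for j, c in enumerate(cols):
        others = cols[:j] + cols[j + 1:]
        for group in combinations(others, d):
            if c <= frozenset().union(*group):
                return False
    return True

q = 3                                       # d + 1 = 3, so d = 2
cols = affine_plane_columns(q)
t, n = q * q, len(cols)
print(t, n, {len(c) for c in cols})         # rows, columns, column weights
print(is_d_disjunct(cols, q - 1))           # d-disjunct with d = q - 1
```

Two distinct lines meet in at most one point, so a line with d + 1 points cannot be covered by d other lines, which is why the check succeeds.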

Our main results are presented as follows. 

Theorem I.1. Suppose M is a t × n d-disjunct matrix with
constant column weight d + 1. If n > t, then $t \ge (d+1)^2$.

Theorem I.2. Suppose M is a t × n d-disjunct matrix. If n > t,
then it holds that $t \ge \frac{15+\sqrt{33}}{24} d^2$.

The rest of this paper is organised as follows. In Section II
we prove Theorem I.1 and in Section III we prove Theorem I.2.
We conclude the paper in Section IV.


II. A SIMPLE BOUND FOR CONSTANT WEIGHT MATRIX

For a t × n binary matrix M, the 1s and 0s can represent the
incidence structure of a corresponding set system. Let
T = {1, ..., t} be a set of t elements and let
$\mathcal{F} = \{F_1, \ldots, F_n\} \subseteq 2^T$ be a
collection of subsets of T. Then M can be viewed as the
incidence matrix of $(T, \mathcal{F})$ such that for all
1 ≤ i ≤ t, 1 ≤ j ≤ n, $i \in F_j$ if and only if M(i,j) = 1. We
can simply replace the set $F_j$ by the column $c_j$ and write
$i \in c_j$ if M(i,j) = 1, which also indicates that row i is
contained in column $c_j$. A column of M is called isolated if
there exists a row incident to it but not to any other column;
such a row is called an isolated row. If M is d-disjunct and has
an isolated column c, then by deleting c and an isolated row
contained in it we get a (t − 1) × (n − 1) matrix M' which
remains d-disjunct. Moreover, n > t holds for the original
matrix M if and only if n − 1 > t − 1 holds for the new matrix
M'. By the definition of T(d) we then have t − 1 ≥ T(d), that
is, t ≥ T(d) + 1. We can summarise this observation as the
following lemma.

Lemma II.1. Suppose M is a t × n d-disjunct matrix with an
isolated column c. If n > t, then t > T(d).

Proof: Deleting column c and the corresponding isolated rows
yields a $(t - r_c) \times (n - 1)$ d-disjunct matrix, where
$r_c \ge 1$ is the number of isolated rows contained in c. Since
$n - 1 > t - r_c$, the definition of T(d) gives
$t - r_c \ge T(d)$ and hence $t \ge T(d) + r_c > T(d)$. ■

Therefore, to determine T(d) we only need to consider matrices
with no isolated columns. The weight of a column c, denoted by
|c|, is defined to be the number of 1s contained in it. One can
see that a non-isolated column in a d-disjunct matrix has weight
at least d + 1: every 1 in such a column is also contained in
some other column, so a column of weight at most d would be
covered by at most d other columns. So for a matrix with no
isolated columns, the minimal column weight is at least d + 1.
Theorem I.1 establishes the validity of the stronger version of
the conjecture in the simplest case, i.e., when the d-disjunct
matrix being considered has constant column weight d + 1.

We present the proof of Theorem I.1 as follows.

Proof of Theorem I.1: By Lemma II.1 we can always assume that M
has no isolated columns. Then any two distinct columns c, c'
satisfy $|c \cap c'| \le 1$: otherwise c would be covered by c'
together with at most d − 1 further columns, one for each
remaining 1 of c. Denote by C(i) the collection of columns that
have a 1 in the i-th row. By counting the number of 1s in the
whole matrix we get
$\sum_{i=1}^{t} |C(i)| = n(d+1) \ge (t+1)(d+1)$. Therefore,
there exists some $1 \le i_0 \le t$ such that
$|C(i_0)| \ge \lceil (t+1)(d+1)/t \rceil \ge d+2$. Note that
$c \cap c' = \{i_0\}$ holds for all distinct $c, c' \in C(i_0)$,
so the theorem follows from
$t \ge |\bigvee_{c \in C(i_0)} c| \ge 1 + (d+2)d = (d+1)^2$,
where $\bigvee$ denotes the boolean sum (the union) of the
columns. ■
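
As a worked instance of this counting argument, consider d = 2, so that every column has weight 3. The display below simply substitutes d = 2 into the three steps of the proof; it is an illustration rather than an additional result.

```latex
% Worked instance of the counting argument for d = 2 (column weight 3).
\begin{align*}
\sum_{i=1}^{t} |C(i)| &= 3n \ge 3(t+1), \\
|C(i_0)| &\ge \left\lceil \frac{3(t+1)}{t} \right\rceil \ge 4
  \quad \text{for some row } i_0, \\
t &\ge \Bigl|\bigvee_{c \in C(i_0)} c\Bigr| \ge 1 + 4 \cdot 2 = 9 = (2+1)^2 .
\end{align*}
```

The last step uses that the columns through row $i_0$ pairwise meet only in $i_0$, so each of them contributes two further rows of its own.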

III. A GENERAL BOUND FOR T(d)

Suppose K is a k-element set. We use $\binom{K}{\lambda}$ to
denote the collection of all λ-element subsets of K, where
1 ≤ λ ≤ k is a positive integer. Let
$\mathcal{G} \subseteq \binom{K}{\lambda}$ be a family of
λ-element subsets of K. The matching number
$\nu(\mathcal{G})$ is defined to be the maximum number of
pairwise disjoint members of $\mathcal{G}$. One of the classical
problems of extremal set theory is to determine
$\max |\mathcal{G}|$ for fixed $\nu(\mathcal{G})$. Define
$m(k, \lambda, \mu) = \max\{|\mathcal{G}| : \mathcal{G} \subseteq \binom{K}{\lambda}, |K| = k, \nu(\mathcal{G}) \le \mu\}$.
In 1959, Erdős and Gallai [13] determined $m(k, \lambda, \mu)$
for λ = 2 (see Theorem 4.1 of [13]).

Lemma III.1. ([13]) $m(k, 2, \mu) \le \max\{\binom{2\mu+1}{2}, \binom{k}{2} - \binom{k-\mu}{2}\}$
for $k \ge 2\mu + 1$.
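
For very small parameters the bound of Lemma III.1 can be compared with an exhaustive search over all graphs. The sketch below is an illustration only: the function names are chosen here, and the search is exponential, so it is feasible only for k up to about 5.

```python
from itertools import combinations
from math import comb

def erdos_gallai_bound(k, mu):
    """Right-hand side of Lemma III.1 (stated for k >= 2*mu + 1)."""
    return max(comb(2 * mu + 1, 2), comb(k, 2) - comb(k - mu, 2))

def matching_number(edges):
    """Maximum number of pairwise disjoint edges, by exhaustive search."""
    if not edges:
        return 0
    (u, v), rest = edges[0], edges[1:]
    skip = matching_number(rest)
    take = 1 + matching_number([e for e in rest if u not in e and v not in e])
    return max(skip, take)

def max_edges_brute_force(k, mu):
    """Largest number of edges of a graph on k vertices whose matching
    number is at most mu, found by trying every edge set."""
    all_edges = list(combinations(range(k), 2))
    best = 0
    for mask in range(1 << len(all_edges)):
        edges = [e for i, e in enumerate(all_edges) if (mask >> i) & 1]
        if len(edges) > best and matching_number(edges) <= mu:
            best = len(edges)
    return best

for k, mu in [(4, 1), (5, 1), (5, 2)]:
    print(k, mu, max_edges_brute_force(k, mu), erdos_gallai_bound(k, mu))
```

The two terms of the maximum correspond to a complete graph on 2μ + 1 vertices and to the graph of all edges meeting a fixed set of μ vertices.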

A very important notion in studying disjunct matrices is
“privateness”, which was originally introduced under the name
“own part”. For a given matrix M, a subset of {1, ..., t} is
called private if it belongs to a unique column. On the
contrary, a subset of {1, ..., t} is called non-private if it
belongs to at least two columns. When proving Theorem I.1 we
actually considered private 1-subsets, since a column is
isolated if and only if it contains a private 1-subset. In order
to establish our general bound, we investigate the properties of
private 2-subsets. More precisely, we obtain a lower bound for
the number of private 2-subsets that a column must contain. For
a column c, denote by
$P(c) = \{T \subseteq \{1, \ldots, t\} : |T| = 2, T \subseteq c \text{ and } T \text{ is private}\}$
the collection of private 2-subsets contained in c, and denote
by N(c) the collection of non-private 2-subsets contained in c.
If column c has weight k, then $\binom{k}{2} = |P(c)| + |N(c)|$
since P(c) and N(c) partition all 2-subsets of c. The lemma
below presents an upper bound for the size of N(c).
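
The definitions of P(c) and N(c) translate directly into code. The following minimal sketch uses a hypothetical toy matrix, given column by column as sets of row indices; the function name is also an assumption made for illustration.

```python
from itertools import combinations

def private_and_nonprivate_pairs(cols, j):
    """Split the 2-subsets of column j into the private pairs P (contained
    in no other column) and the non-private pairs N."""
    c = cols[j]
    P, N = set(), set()
    for pair in combinations(sorted(c), 2):
        shared = any(set(pair) <= other
                     for k, other in enumerate(cols) if k != j)
        (N if shared else P).add(pair)
    return P, N

# Hypothetical toy matrix, one frozenset of row indices per column.
cols = [frozenset({0, 1, 2}), frozenset({0, 1, 3}), frozenset({2, 3, 4})]
P, N = private_and_nonprivate_pairs(cols, 0)
print(sorted(P), sorted(N))
```

In this toy example the pair of rows 0 and 1 also lies in the second column, so it is non-private, and the sizes of P and N add up to the number of 2-subsets of the column.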

Lemma III.2. Suppose M is a t × n d-disjunct matrix with no
isolated columns. Then for any column c satisfying |c| = d + s,
where 1 ≤ s ≤ d − 1, it holds that
$|N(c)| \le m(d+s, 2, s-1) \le \max\{\binom{2s-1}{2}, \binom{d+s}{2} - \binom{d+1}{2}\}$.

Proof: It suffices to show that N(c) does not contain s pairwise
disjoint members. Suppose otherwise. Each of these s pairs lies
in some column other than c, so the 2s rows they span are
covered by at most s other columns. The remaining
(d + s) − 2s = d − s 1s of c are contained in the union of some
d − s other columns of M, since c has no private 1-subsets. Then
c is contained in the union of some s + (d − s) = d columns,
which violates the d-disjunct property. ■

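For s ≥ 1, by direct computation one can verify that the
following formula holds:

$$\max\left\{\binom{2s-1}{2},\ \binom{d+s}{2} - \binom{d+1}{2}\right\} = \begin{cases} \binom{d+s}{2} - \binom{d+1}{2}, & s \le \frac{2d+2}{3}, \\ \binom{2s-1}{2}, & s > \frac{2d+2}{3}. \end{cases} \qquad (1)$$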
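
The direct computation can be spelled out as follows. This is a sketch of the algebra that compares the two terms of the maximum in Lemma III.2; with |c| = d + s, the resulting threshold on s is the same as |c| ≤ 5d/3 + 2/3.

```latex
% Comparing the two terms of the maximum in Lemma III.2.
\begin{align*}
\binom{d+s}{2} - \binom{d+1}{2}
  &= \frac{(d+s)(d+s-1) - (d+1)d}{2} = \frac{(s-1)(2d+s)}{2}, \\
\binom{2s-1}{2} &= (s-1)(2s-1), \\
\frac{(s-1)(2d+s)}{2} \ge (s-1)(2s-1)
  &\iff 2d + s \ge 4s - 2 \iff s \le \frac{2d+2}{3}
  \qquad (s \ge 2).
\end{align*}
```

For s = 1 both terms vanish, so the comparison holds trivially.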




One more lemma is needed to prove Theorem I.2.

Lemma III.3. Suppose M is a t × n d-disjunct matrix. Assume c
is an arbitrary column of M with weight $w_c$. Then deleting c
and all rows intersecting it yields a $(t - w_c) \times (n - 1)$
(d − 1)-disjunct matrix.

Proof: See Lemma 2.2.2 of [9]. ■
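
The deletion step of Lemma III.3 can be illustrated on a small example. The sketch below is a brute-force check under assumptions: the input is the affine plane over the integers modulo 3 (a 9 × 12 2-disjunct matrix), and the function names are chosen here for illustration.

```python
from itertools import combinations

def is_d_disjunct(cols, d):
    """cols: list of frozensets of row indices, one per column."""
    for j, c in enumerate(cols):
        others = cols[:j] + cols[j + 1:]
        for group in combinations(others, d):
            if c <= frozenset().union(*group):
                return False
    return True

def delete_column_and_rows(cols, j):
    """Drop column j and every row in which it has a 1."""
    removed = cols[j]
    return [c - removed for k, c in enumerate(cols) if k != j]

# Example input (an assumption for illustration): the 12 lines of the
# affine plane over Z_3, with points indexed as x*3 + y.
q = 3
cols = [frozenset(x * q + (m * x + b) % q for x in range(q))
        for m in range(q) for b in range(q)]
cols += [frozenset(a * q + y for y in range(q)) for a in range(q)]

reduced = delete_column_and_rows(cols, 0)
rows_left = frozenset().union(*reduced)
print(len(rows_left), len(reduced))     # t - w_c rows and n - 1 columns
print(is_d_disjunct(reduced, 1))        # (d - 1)-disjunct with d = 2
```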

Proof of Theorem I.2: Again, by Lemma II.1 we can assume that M
has no isolated columns. Then the minimal column weight of M is
at least d + 1. We apply induction on d to prove the theorem.
The statement is true for 1 ≤ d ≤ 5 by previous results, since
the stronger version of the conjecture is known for these d.
Assume the statement is true for d − 1. Let c be a column with
the largest column weight and, for the sake of simplicity,
denote $\kappa = (15 + \sqrt{33})/24$. Then our goal is to prove
$t \ge \kappa d^2$. The proof is divided into two cases:

Case 1. $|c| \ge \lceil 2\kappa d \rceil$. By Lemma III.3,
deleting c and all rows intersecting it we get a
$(t - |c|) \times (n - 1)$ (d − 1)-disjunct matrix. Obviously,
n − 1 > t − |c| since n > t. By the induction hypothesis we can
deduce that
$t \ge |c| + \kappa (d-1)^2 \ge 2\kappa d + \kappa (d-1)^2 > \kappa d^2$.

Case 2. $|c| \le \lfloor 2\kappa d \rfloor$. Then every column
of M has weight at most $\lfloor 2\kappa d \rfloor$. Fix a
column u with |u| = d + s, where 1 ≤ s ≤ (2κ − 1)d. Let us
estimate the number of private 2-subsets contained in u. On one
hand, if $|u| \le \frac{5d}{3} + \frac{2}{3}$, then by the first
formula of (1) we have
$|P(u)| = \binom{d+s}{2} - |N(u)| \ge \binom{d+1}{2}$. On the
other hand, if $|u| > \frac{5d}{3} + \frac{2}{3}$, then
2d/3 < s ≤ (2κ − 1)d, and by the second formula of (1) we have
$|P(u)| \ge \binom{d+s}{2} - \binom{2s-1}{2} \ge \frac{d^2 + 2ds - 3s^2}{2} \ge (3\kappa - 1)(2 - 2\kappa) d^2 = \kappa d^2/2$.

Note that $|P(u)| \ge \kappa d^2/2$ holds in both cases since
κ < 1. Then the statement follows from the fact that the number
of private 2-subsets in {1, ..., t} cannot exceed
$\binom{t}{2}$, i.e.,
$\binom{t}{2} \ge \sum_{c} |P(c)| \ge n \times \kappa d^2/2 \ge (t+1) \kappa d^2/2$. ■
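
The constant (15 + √33)/24 is exactly the value that balances the second estimate of Case 2. The numeric check below is a sketch under assumptions: the helper name and the choice d = 30 are made here for illustration, and floating point is used instead of exact arithmetic.

```python
from math import sqrt, isclose

kappa = (15 + sqrt(33)) / 24

def case2_bound(d, s):
    """(d^2 + 2ds - 3s^2)/2, the lower bound on the number of private
    2-subsets used in the second estimate of Case 2."""
    return (d * d + 2 * d * s - 3 * s * s) / 2

# kappa solves (3k - 1)(2 - 2k) = k/2, i.e. 12k^2 - 15k + 4 = 0.
residual = 12 * kappa**2 - 15 * kappa + 4
print(isclose(residual, 0.0, abs_tol=1e-12))

# At the largest admissible s = (2*kappa - 1)*d the bound equals
# kappa*d^2/2; for smaller s in the range it is larger.
d = 30
s_max = (2 * kappa - 1) * d
print(case2_bound(d, s_max), kappa * d * d / 2)
```

Since d² + 2ds − 3s² is decreasing in s for s > d/3, the worst case over 2d/3 < s ≤ (2κ − 1)d is the right endpoint, which is why the constant appears in the final inequality.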

The following result is straightforward. 

Corollary III.4. Denote by t(d, n) the minimal t such that there
exists a t × n d-disjunct matrix. Then it holds that
$t(d,n) \ge \min\{\frac{15+\sqrt{33}}{24} d^2, n\}$.

Proof: The corollary holds since $t(d,n) \ge \min\{T(d), n\}$.

■ 
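
For concrete parameters, the bound of Corollary III.4 can be evaluated and compared with the earlier bound based on $\binom{d+2}{2}$. The sketch below rounds up to an integer number of tests and uses floating point, which is an assumption that is adequate only for moderate d; the function names are chosen here for illustration.

```python
from math import ceil, comb, sqrt

KAPPA = (15 + sqrt(33)) / 24

def lower_bound_tests(d, n):
    """min{ceil(KAPPA * d^2), n}: the corollary's bound, rounded up."""
    return min(ceil(KAPPA * d * d), n)

def earlier_lower_bound(d, n):
    """min{C(d+2, 2), n}: the bound attributed to Bassalygo."""
    return min(comb(d + 2, 2), n)

for d in (6, 10, 20):
    print(d, earlier_lower_bound(d, 10**6), lower_bound_tests(d, 10**6))
```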

Through an argument similar to that of Theorem I.2 one can prove
the following corollary.

Corollary III.5. Suppose M is a t × n d-disjunct matrix with
n > t such that every column c of M satisfies
$|c| \le \frac{5d}{3} + \frac{2}{3}$. Then it holds that
$t \ge d^2 + d + 1$.

Proof: By (1) we have $|P(c)| \ge \binom{d+1}{2}$ for every
non-isolated column c. Then the conclusion follows from
$\binom{t}{2} \ge \sum_{c} |P(c)| \ge (t+1) \binom{d+1}{2}$. ■

IV. Concluding remarks 

In this paper we consider lower bounds on the minimal t for
which there exists a t × n d-disjunct matrix with n > t, and our
new bound $T(d) \ge \frac{15+\sqrt{33}}{24} d^2$ improves the
previous results. The novelty of our method is that we consider
the properties of private 2-subsets of the given disjunct matrix
and apply a graph matching theorem of Erdős and Gallai [13]. A
natural way to generalise our method is to consider larger
private subsets, for which a hypergraph version of the matching
theorem would be needed. It would be interesting to see whether
our results can be improved in this way.